HOT Task Manager - Task 91 project #15590 using SAR imagery for better waterways detection in the DRC¶

Introduction¶

Name: Bolognani Vanessa
Date: 14/01/2025

This Jupyter Notebook showcases the use of Synthetic Aperture Radar (SAR) imagery to detect waterways in regions where traditional optical satellite imagery might fall short. In areas like the Democratic Republic of the Congo (DRC), dense forests, cloud cover, and other environmental factors often obscure rivers, lakes, and streams, making it challenging to map these important features. SAR, however, offers a powerful solution by penetrating both cloud cover and dense vegetation, providing detailed information about the landscape, including water bodies.

In this notebook, we focus on Project: DRC Waterways: Kasaï 2023 (Part 4/7) #15590, Task 91, available through the HOT (Humanitarian OpenStreetMap Team) tasking manager.

Image 1 Image 2

The bounding box for this task area is defined by specific coordinates, marking the geographical region of focus for this project. Below are the coordinates for the bounding box:

  • Min Latitude (minlat): -5.386893
  • Min Longitude (minlon): 23.096929
  • Max Latitude (maxlat): -5.223223
  • Max Longitude (maxlon): 23.132675

The coordinates were extracted from JOSM. This bounding box covers a portion of the Kasaï region in the Democratic Republic of the Congo (DRC). The area enclosed by this bounding box can be visualized as a rectangle that represents the region to be studied. SAR imagery will be utilized within this area to detect waterways that are often obscured by dense vegetation or forest cover, which can make them difficult to identify using optical imagery alone. By applying SAR data, this project will help uncover these waterways, improving the accuracy and completeness of the OSM waterway network in this region.


Import Python Libraries¶

In [2]:
##Part 1: Acquire OSM data using Overpass API
#import relevant libraries
import overpy #to retrieve OSM data
from shapely.geometry import Point, LineString, Polygon, MultiPolygon, box #Convert overpy results into valid geom features. GeoPanda implements Shapely geometries
import geopandas as gpd #to work with Pandas GeoDataFrames
import lonboard #to visualize vector and raster data
from lonboard import Map, PolygonLayer, ScatterplotLayer, PathLayer, BitmapLayer

##Part 2: Aquire Satellite Imagery Data
from sentinelhub import SHConfig, WmsRequest, DataCollection, BBox, CRS
import ipywidgets as widgets #create interactive slider
from ipywidgets import VBox

Aquire OSM data using Overpass API¶

There are two main ways to access OSM data:

Image Description

In this notebook, OSM data will be retrieved via the Overpass API. More specifically, I will use the overpy library. The queries will be written using the Overpass Query Language and tested on Overpass Turbo.

Query the OSM Database¶

In [2]:
#Class to access the Overpass API
api = overpy.Overpass()

#Overpass query:
    ##Define global bounding box
    #Retrieve all nodes, ways and relations within the bounding box
    ##Return features id, geometry and tags
    
query = """
[out:json][bbox:-5.386893, 23.096929, -5.223223, 23.132675];
nwr;
out body;
"""

#Query the OSM Database using the Overpass API
result = api.query(query)

#Explore the results 
print(f"Number of nodes: {len(result.nodes)}")
print(f"Number of ways: {len(result.ways)}")
print(f"Number of relations: {len(result.relations)}")
Number of nodes: 67
Number of ways: 2
Number of relations: 1

Convert Overpy result in Shapely geometries¶

Nodes into Points¶

In [3]:
#Parse the result into a list of Point geometries
# List to store the nodes (as Points)
nodes = []

for node in result.nodes:
    # Create a point geometry from each node
    point = Point(node.lon, node.lat)
    nodes.append(point)

#Create a GeoDataFrame storing nodes geometries and set the Coordinate Reference System - OSM data collected using WGS84 (EPSG:4326)
gdf_nodes = gpd.GeoDataFrame(geometry=nodes, crs='EPSG:4326')

#Display the Point GeoDataFrame
display(gdf_nodes)
geometry
0 POINT (23.11597 -5.36569)
1 POINT (23.11621 -5.36560)
2 POINT (23.11648 -5.36606)
3 POINT (23.11728 -5.36664)
4 POINT (23.11698 -5.36685)
... ...
62 POINT (23.09898 -5.23739)
63 POINT (23.09844 -5.23767)
64 POINT (23.09810 -5.23772)
65 POINT (23.09748 -5.23728)
66 POINT (23.09702 -5.23753)

67 rows × 1 columns

Ways into Linestrings¶

In [4]:
# List to store the ways (as LineString) and polygons (as Polygon)
ways = []
polygons = []

# Iterate over the ways in the result
for way in result.ways:
    # Fetch the nodes for each way, resolving missing ones
    nodes_ways = way.get_nodes(resolve_missing=True)
    
    # Extract coordinates (lon, lat) for each node
    coords = [(node.lon, node.lat) for node in nodes_ways]
    
    if len(coords) > 1:  # A valid way needs at least two nodes
        # Check if the way is a closed loop (first and last coordinates should be the same)
        if coords[0] == coords[-1]:
            # Create a Polygon from the nodes' coordinates
            polygon = Polygon(coords)
            if polygon.is_valid:  # Only add valid polygons
                polygons.append(polygon)
        else:
            # Create a LineString from the nodes' coordinates
            line = LineString(coords)
            ways.append(line)

# Create GeoDataFrames for both ways and polygons
gdf_ways = gpd.GeoDataFrame(geometry=ways, crs='EPSG:4326')
gdf_polygons = gpd.GeoDataFrame(geometry=polygons, crs='EPSG:4326')

display(gdf_ways)  # Display the ways GeoDataFrame
display(gdf_polygons)  # Display the polygons GeoDataFrame
geometry
0 LINESTRING (23.05078 -5.25625, 23.05138 -5.255...
geometry
0 POLYGON ((23.11597 -5.36569, 23.11621 -5.36560...

Relations into Polygons/MultiPolygons¶

In [5]:
# List to store the relations (as LineString)
relations = []

# Iterate over the relations in the result
for relation in result.relations:
    # Initialize a list to store the coordinates for this relation
    relation_coords = []
    
    # Iterate over the members of the relation
    for member in relation.members:
        # Check if the member is a way
        if member.role == 'way':
            # Fetch the way (using 'ref' to get the specific way)
            way = result.get_way(member.ref)
            
            # Fetch the nodes for the way, resolving missing ones
            nodes_ways = way.get_nodes(resolve_missing=True)
            
            # Extract coordinates (lon, lat) for each node
            coords = [(node.lon, node.lat) for node in nodes_ways]
            
            # If the way has at least two nodes, add its coordinates to the relation's list
            if len(coords) > 1:
                relation_coords.extend(coords)
    
    # If the relation contains enough coordinates to form a LineString, create it
    if len(relation_coords) > 1:
        relation_line = LineString(relation_coords)
        relations.append(relation_line)

# Create a GeoDataFrame storing relations' geometries and set the Coordinate Reference System (WGS84)
gdf_relations = gpd.GeoDataFrame(geometry=relations, crs='EPSG:4326')

# Display the Relation GeoDataFrame. No valid relations were found
display(gdf_relations)
geometry

Visualize OSM data using Lonboard¶

To visualize the data, I will make use of the Lonboard library. This library allows for fast, interactive geospatial vector data visualization in Jupyter. In the background, Lonboard converts the data into the GeoParquet format, an optimized storage format for geospatial data.

In [11]:
##Visualize bounding box using the lonboard library

# Define the bounding box coordinates - coordinates retrieved from JOSM
minlat = -5.386893
minlon = 23.096929
maxlat = -5.223223
maxlon = 23.132675

# Create a bounding box (Polygon) using shapely's box function
bounding_box = box(minlon, minlat, maxlon, maxlat)

# Create a GeoDataFrame storing the bounding box coordinates
gdf_bbox = gpd.GeoDataFrame(geometry=[bounding_box], crs="EPSG:4326")

#Create Map layer to visualize bounding box
layer_bbox = PolygonLayer.from_geopandas(
    gdf_bbox,
    get_fill_color=[0, 0, 0, 0], #no fill
    get_line_color=[0, 0, 0, 255], #display using black outline
    get_line_width=50,
)

#Define Task 91 boundary - coordinates retrived from JOSM
coords = [
    (23.110913, -5.227001),
    (23.126144, -5.223223),
    (23.129866, -5.229228),
    (23.132675, -5.24745),
    (23.128977, -5.256582),
    (23.129946, -5.278238),
    (23.122123, -5.293487),
    (23.120868, -5.300179),
    (23.122401, -5.313886),
    (23.116915, -5.34567),
    (23.121349, -5.356179),
    (23.121333, -5.360712),
    (23.125283, -5.363605),
    (23.127728, -5.369857),
    (23.127551, -5.380092),
    (23.121329, -5.386705),
    (23.115152, -5.386893),
    (23.110043, -5.383415),
    (23.10791, -5.376104),
    (23.104621, -5.37368),
    (23.101743, -5.368516),
    (23.101253, -5.360301),
    (23.096929, -5.347627),
    (23.102419, -5.312069),
    (23.1009, -5.298146),
    (23.103792, -5.285336),
    (23.11002, -5.273845),
    (23.109009, -5.252853),
    (23.112356, -5.24486),
    (23.110182, -5.232768),
    (23.110913, -5.227001)
]

# Create a Polygon using shapely
polygon = Polygon(coords)

# Create a GeoDataFrame storing the polygon coordinates
gdf_task91 = gpd.GeoDataFrame(geometry=[polygon], crs='EPSG:4326')

#Create Map layer to visualize Task 91
layer_task91 = PolygonLayer.from_geopandas(
    gdf_task91,
    get_fill_color=[0, 0, 0, 0], #no fill
    get_line_color=[255, 0, 0], #display using red outline
    get_line_width=50,
)

##Visualize OSM data
#Create Map layer to visualize OSM nodes
layer_nodes = ScatterplotLayer.from_geopandas(
    gdf_nodes,
    get_fill_color=[255,255,0], #yellow fill
    get_line_color=[0, 0, 0], #black outline
    stroked=True,
    radius_scale=10,
)

#Create Map layer to visualize OSM ways
layer_ways = PathLayer.from_geopandas(
    gdf_ways,
    get_color=[135, 206, 235], #blue fill
    width_min_pixels=2,
)

#Create Map layer to visualize OSM closed ways
layer_polygons = PolygonLayer.from_geopandas(
    gdf_polygons,
    get_fill_color=[255,127,80] #orange fill
)

m = Map([layer_bbox, layer_task91, layer_nodes, layer_ways, layer_polygons])
m
Out[11]:
Map(custom_attribution='', layers=(PolygonLayer(get_fill_color=[0, 0, 0, 0], get_line_color=[0, 0, 0, 255], ge…

Aquire Satellite Imagery Data¶

SAR imagery can be accessed from various sources, including Sentinel-1 data, which provides VV polarization images. This code retrieves Sentinel-1 SAR imagery in VV polarization for the specified bounding box of Task 91. It uses the sentinelhub Python package, which is the official Python interface for Sentinel Hub services. The image is saved locally for further analysis.

To get client ID and secret, sign up here Follow the process highlighted in this page

To get instance id, sign up here

List of Data Collections can be found here

Authentication to access Sentinel Data via API¶

In [3]:
#Configuration to retrieve SAR data (Sentinel 1)
configSAR = SHConfig()

client_id ='' # Replace with your client ID
client_secret = '' # Replace with your client secret

configSAR.sh_client_id = client_id
configSAR.sh_client_secret = client_secret
configSAR.instance_id = '' # Replace with your instance ID
    
configSAR

#Configuration to retrieve True Color data (Sentinel 2)
configRGB = SHConfig()

configRGB.sh_client_id = client_id
configRGB.sh_client_secret = client_secret
configRGB.instance_id = '' # Replace with your instance ID
    
configRGB
Out[3]:
SHConfig(
  instance_id='',
  sh_client_id='',
  sh_client_secret='',
  sh_base_url='https://services.sentinel-hub.com',
  sh_auth_base_url=None,
  sh_token_url='https://services.sentinel-hub.com/auth/realms/main/protocol/openid-connect/token',
  geopedia_wms_url='https://service.geopedia.world',
  geopedia_rest_url='https://www.geopedia.world/rest',
  aws_access_key_id='',
  aws_secret_access_key='',
  aws_session_token='',
  aws_metadata_url='https://roda.sentinel-hub.com',
  aws_s3_l1c_bucket='sentinel-s2-l1c',
  aws_s3_l2a_bucket='sentinel-s2-l2a',
  opensearch_url='http://opensearch.sentinel-hub.com/resto/api/collections/Sentinel2',
  max_wfs_records_per_query=100,
  max_opensearch_records_per_query=500,
  max_download_attempts=4,
  download_sleep_time=5.0,
  download_timeout_seconds=120.0,
  number_of_download_processes=1,
  max_retries=None,
)

Retrieve SAR imagery¶

In [14]:
#Send a WmsRequest to retrieve SAR imagery
wms_sentinel1_vv_request = WmsRequest(
    data_collection=DataCollection.SENTINEL1_IW,  #Use Sentinel-1 data collection
    layer="IW_VV",  #Use the VV polarization layer (adjust if needed based on your WMS service)
    bbox=BBox(bounding_box, crs=CRS.WGS84),  #Bounding box in WGS84 coordinates
    time="latest",  #Use the latest available image  - "2025-01-13T14:34:50.126445"
    width=218, 
    height=1000,
    data_folder = r"C:\Users\Vanessa\Desktop\HOT Task 91 project #15590", #specify location where to save data
    config=configSAR,  
)

#Save data locally. Image png saved as 'response'
wms_sar_img = wms_sentinel1_vv_request.get_data(save_data=True)
C:\Users\Vanessa\AppData\Local\ESRI\conda\envs\myenv\lib\site-packages\sentinelhub\geometry.py:115: SHDeprecationWarning: Initializing `BBox` objects from `shapely` geometries will no longer be possible in future versions. Use the `bounds` property of the `shapely` geometry to initialize the `BBox` instead.
  x_fst, y_fst, x_snd, y_snd = self._to_tuple(bbox)

Retrieve True Color imagery¶

In [15]:
#Send a WmsRequest to retrieve SAR imagery
wms_sentinell2a_request = WmsRequest(
    data_collection=DataCollection.SENTINEL2_L2A,  #Use Sentinel2_L2A data collection
    layer="1_TRUE_COLOR",  #Retrieve True Color imagery (adjust if needed based on your WMS service - I used the premade Sentinel 2 WMS template)
    bbox=BBox(bounding_box, crs=CRS.WGS84),  #Bounding box in WGS84 coordinates
    time="2024-06-26",
    width=218, 
    height=1000,
    data_folder = r"C:\Users\Vanessa\Desktop\HOT Task 91 project #15590", #specify location where to save data
    config=configRGB,  
)

#Save data locally. Image png saved as 'response'
wms_rgb_img = wms_sentinell2a_request.get_data(save_data=True)
C:\Users\Vanessa\AppData\Local\ESRI\conda\envs\myenv\lib\site-packages\sentinelhub\geometry.py:115: SHDeprecationWarning: Initializing `BBox` objects from `shapely` geometries will no longer be possible in future versions. Use the `bounds` property of the `shapely` geometry to initialize the `BBox` instead.
  x_fst, y_fst, x_snd, y_snd = self._to_tuple(bbox)

Visualize Satellite data using Lonboard¶

In [16]:
#Create Map layer to visualize SAR satellite imagery
layer_sar = BitmapLayer(
    image="https://raw.githubusercontent.com/Vanessa-dev-spatial/portfolio-examples/main/response_sar.png", #'response' image png saved on GitHub as 'response_sar'
    bounds= [23.096929, -5.386893,23.132675, -5.223223]
)

#Create Map layer to visualize True Color satellite imagery
layer_rgb = BitmapLayer(
    image="https://raw.githubusercontent.com/Vanessa-dev-spatial/portfolio-examples/main/response_rgb.png", #'response' image png saved on GitHub as 'response_rgb'
    bounds= [23.096929, -5.386893,23.132675, -5.223223]
)

#add layers to Map - create new map to achieve correct order of layers
m = Map([layer_rgb, layer_sar])
m.add_layer([layer_bbox, layer_task91, layer_nodes, layer_ways, layer_polygons])
m
Out[16]:
Map(custom_attribution='', layers=(BitmapLayer(bounds=(23.096929, -5.386893, 23.132675, -5.223223), image='htt…
In [35]:
#display final map
m.as_html()
Out[35]:
Lonboard export

SAR vs True Image Comparison¶

In [26]:
# Create a slider to control the opacity (visibility) of the layers
slider = widgets.FloatSlider(
    value=0.5,
    min=0,
    max=1,
    step=0.01,
    description="Opacity",
    continuous_update=False
)

# Function to update the opacity based on the slider
def update_opacity(change):
    opacity = change['new']
    
    # Update opacity of the SAR and RGB layers
    layer_sar.opacity = opacity
    layer_rgb.opacity = 1 - opacity  # To create a "swipe" effect

# Link the slider to the update function
slider.observe(update_opacity, names='value')

# Display the map and slider
VBox([m, slider])
Out[26]:
VBox(children=(Map(custom_attribution='', layers=(BitmapLayer(bounds=(23.096929, -5.386893, 23.132675, -5.2232…

Conclusion¶

Sentinel-1 SAR imagery with VV polarization offers a promising approach for mapping waterways in regions like Kasaï, where dense vegetation often hides water features. Its ability to capture data regardless of weather or time of day makes it invaluable for improving the accuracy and completeness of waterway networks.